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We introduce a new implementation of time-dependent density-functional theory which allows the entire 
spectrum of a molecule or extended system to be computed with a numerical effort comparable to that of a 
single standard ground-state calculation. This method is particularly well suited for large systems and/or large 
basis sets, such as plane waves or real-space grids. By using a super-operator formulation of linearized time- 
dependent density-functional theory, we first represent the dynamical polarizability of an interacting-electron 
system as an off-diagonal matrix element of the resolvent of the Liouvillian super-operator. One-electron opera- 
tors and density matrices are treated using a representation borrowed from time-independent density-functional 
perturbation theory, which permits to avoid the calculation of unoccupied Kohn-Sham orbitals. The resolvent of 
the Liouvillian is evaluated through a newly developed algorithm based on the non-symmetric Lanczos method. 
Each step of the Lanczos recursion essentially requires twice as many operations as a single step of the iterative 
diagonalization of the unperturbed Kohn-Sham Hamiltonian. Suitable extrapolation of the Lanczos coefficients 
allows for a dramatic reduction of the number of Lanczos steps necessary to obtain well converged spectra, 
bringing such number down to hundreds (or a few thousands, at worst) in typical plane-wave pseudopotential 
applications. The resulting numerical workload is only a few times larger than that needed by a ground-state 
Kohn-Sham calculation for a same system. Our method is demonstrated with the calculation of the spectra of 
benzene, Ceo fullerene, and of chlorofyll a. 

PACS numbers: 31.15.-p 71.15.Qe 31.15.Ew 71.15.Mb 33.20.Lg 



I. INTRODUCTION 

Time-dependent density-functional theory (TDDFT) UJ 
stands as a promising alternative to cumbersome many-body 
approaches to the calculation of the electronic excitation spec- 
tra of molecular and condensed-matter systems |j2). Accord- 
ing to a theorem established by Runge and Gross in the 
mid eighties [1|, for any given initial (t = 0) state of an 
interacting-electron system, the external, time-dependent, po- 
tential acting on it is uniquely determined by the time evolu- 
tion of the one-electron density, n(r, t), for t > 0. Using this 
theorem, one can formally establish a time-dependent Kohn- 
Sham (KS) equation from which various one-particle proper- 
ties of the system can be obtained as functions of time. Un- 
fortunately, if little is known about the exchange-correlation 
(XC) potential in ordinary density-functional theory (DFT) 
EIH, even less is known about it in the time-dependent case. 
Most of the existing applications of TDDFT are based on the 
so-called adiabatic local density or adiabatic generalized gra- 
dient approximations (generically denoted in the following by 
the acronym ADFT) [5 1, which amount to assuming the same 
functional dependence of the XC potential upon density as in 
the static case. Despite the crudeness of these approximations, 
optical spectra calculated from them are in some cases almost 
as accurate as those obtained from more computationally de- 
manding many -body approaches J2). TDDFT is in principle 
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an exact theory. Progress in understanding and characteriz- 
ing the XC functional will substantially increase the predic- 
tive power of TDDFT, while (hopefully) keeping its computa- 
tional requirements at a significantly lower level than that of 
methods based on many-body perturbation theory. 

Linearization of TDDFT with respect to the strength of 
some external perturbation to an otherwise time-independent 
system leads to a non-Hermitean eigenvalue problem whose 
eigenvalues are excitation energies, and whose eigenvectors 
are related to oscillator strengths |6|. Not surprisingly, this 
eigenvalue problem has the same structure that arises in the 
time-dependent Hartree-Fock theory [7, 8 1, and the dimension 
of the resulting matrix (the Liouvillian) is twice the product of 
the number of occupied (valence) states, N v , with the number 
of unoccupied (conduction) states, N c . The calculation of the 
Liouvillian is by itself a hard task that is often tackled directly 
in terms of the unperturbed KS eigen-pairs. This approach re- 
quires the calculation of the full spectrum of the unperturbed 
KS Hamiltonian, a step that one may want to avoid when 
very large basis sets are used. The diagonalization of the re- 
sulting matrix can be accomplished using iterative techniques 
13 [10), often, but not necessarily, in conjunction with the 
Tamm-Dancoff approximation lfTT|[T2l[T3l . which amounts to 
enforcing Hermiticity by neglecting the anti-Hermitean com- 
ponent of the Liouvillian. The use of iterative diagonalization 
techniques does not necessarily entail the explicit construc- 
tion of the matrix to be diagonalized, but just the availability 
of a black-box routine that performs the product of the matrix 
with a test vector ("Hip products"). An efficient way to cal- 
culate such a product without explicitly calculating the Liou- 
villian can be achieved using a representation of the perturbed 
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density matrix and of the Liouvillian super-operator borrowed 
from time-independent density-functional perturbation theory 
(DFPT) IH [Bl H6l El US). Many applications of TDDFT 
to atoms, molecules, and clusters have been performed within 
such a framework, see for example Refs. J5] [T9j 120). This 
approach is most likely to be optimal when a small number 
of excited states is required. In large systems, however, the 
number of quantum states in any given energy range grows 
with the system size. The number of pseudo-discrete states 
in the continuum also grows with the basis-set size even in a 
small system, thus making the calculation of individual eigen- 
pairs of the Liouvillian more difficult and not as meaningful. 
This problem is sometimes addressed by directly calculating 
the relevant response function(s), rather than individual exci- 
tation eigen-pairs Q2J [9] [T71 [2T|. The price paid in this case 
is the calculation and further manipulation (inversion, mul- 
tiplication) of large matrices for any individual frequency, a 
task which may again be impractical for large systems/basis 
sets, particularly when an extended portion of a richly struc- 
tured spectrum is sought after. For these reasons, a method 
to model the absorption spectrum directly — without calculat- 
ing individual excited states and not requiring the calculation, 
manipulation, and eventual disposal of large matrices — would 
be highly desirable. 

Such an alternative approach to TDDFT, which avoids diag- 
onalization altogether, was proposed by Yabana and Bertsch 
11221 . In this method, the TDDFT KS equations are solved in 
the time domain and susceptibilities are obtained by Fourier 
analyzing the response of the system to appropriate perturba- 
tions in the linear regime. This scheme has the same computa- 
tional complexity as standard time-independent (ground-state) 
iterative methods in DFT For this reason, real-time methods 
have recently gained popularity in conjunction with the use of 
pseudopotentials (PP's) and real-space grids [23 1, and a sim- 
ilar success should be expected using plane-wave (PW) basis 
sets Il24ll25l . The main limitation in this case is that, because 
of stability requirements, the time step needed for the integra- 
tion of the TDDFT KS equations is very small (of the order of 
10~ 3 fs in typical pseudopotential applications) and decreas- 
ing as the inverse of the PW kinetic-energy cutoff (or as the 
square of the real-space grid step) 11251 . The resulting number 
of steps necessary to obtain a meaningful time evolution of 
the TDDFT KS equations may be exceedingly large. 

In a recent letter a new method was proposed ll26l to cal- 
culate optical spectra in the frequency domain — thus avoiding 
any explicit integration of the TDDFT KS equations — which 
does not require any diagonalization (of either the unperturbed 
KS Hamiltonian, or the TDDFT Liouvillian), nor any time- 
consuming matrix operations. Most important, the full spec- 
trum is obtained at once without repeating time-consuming 
operations for different frequencies. In this method, which 
is particularly well suited for large systems and PW, or real- 
space grid, basis sets, a generalized susceptibility is repre- 
sented by a matrix element of the resolvent of the Liouvillian 
super-operator, defined in some appropriate operator space. 
This matrix element is then evaluated using a Lanczos recur- 
sion technique. Each link of the Lanczos chain — that is calcu- 
lated once for all frequencies — requires a number of floating- 



point operations which is only twice as large as that needed by 
a single step of the iterative calculation of a static polarizabil- 
ity within time-independent DFPT lfl4l [T5l [T6l . This number 
is in turn the same as that needed in a single step of the iter- 
ative diagonalization of a ground-state KS Hamiltonian, or a 
single step of Car-Parrinello molecular dynamics. 

The purpose of the present paper is to provide an extended 
and detailed presentation of the method of Ref. E6l and to in- 
troduce a few methodological improvements, including a new 
and more efficient approach to the calculation of off-diagonal 
elements of the resolvent of a non Hermitean operator, and 
an extrapolation technique that allows one to substantially re- 
duce the number of Lanczos recursion steps needed to calcu- 
late well converged optical spectra. The paper is organized 
as follows. In Sec. |ll]we introduce the linearized Liouville 
equation of TDDFT, including the derivation of an expres- 
sion for generalized susceptibilities in terms of the resolvent 
of the Liovillian super-operator, the DFPT representation of 
response operators and of the Liouvillian super-operator, and 
the extension of the formalism to ultrasoft PP's 1271; in Sec. 
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we describe our new Lanczos algorithm for calculating 
selected matrix elements of the resolvent of the Liouvillian 
super-operator; in Sec. [iVjwe present a benchmark of the nu- 
merical performance of the new method, and we introduce an 
extrapolation technique that allows for an impressive enhance- 
ment of it; Sec. [V] contains applications of the new method- 
ology to the spectra of Ceo fullerene and to chlorofyll a; Sec. 
|VI|finally contains our conclusions. 



II. LINEARIZED TIME-DEPENDENT 
DENSITY-FUNCTIONAL THEORY 

The time-dependent KS equations of TDDFT read [1 J: 



.d(p v (r,t) 
1 dt 



H K s(t)ip v (r,t), 



(D 



where 



H KS (t) 



t (r,t)+v HXC (r,t) (2) 



is a time-dependent KS Hamiltonian, v ex t (r, t) and 
VHXc( r it) being the time-dependent external and Hartree 
plus XC potentials, respectively. In the above equation, 
as well as in the following, quantum-mechanical opera- 
tors are denoted by a hat, " " ", and Hartree atomic units 
(h = m = e = 1) are used. When no confusion can arise, 
local operators, such as one-electron potentials, V, will be 
indicated by the diagonal of their real-space representation, 
v(r), as in Eq. (j2j). 

Let us now assume that the external potential is split into 
a time-independent part, vl xt (r), plus a time-dependent per- 
turbation, v' ext (r, t), and let us assume that the tp's satisfy the 
initial conditions: 



^(r,Q) = ^W, 



(3) 
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where ip° are ground-state eigenfunctions of the unperturbed 
KS Hamiltonian, H° KS : 



h ks<P°v{*) = £v<P° v (r). 



(4) 



To first order in the perturbation, the time-dependent KS equa- 
tions can be cast into the form: 



* = (H KS -£ V ) <P v (r,t)+ 
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(v' ext (r,t)+v' HXC (r,t)) V ° v (r), (5) 



where 



^(r,t)=e fc -V B (r,t)-V»SW 



(6) 



are the orbital response functions, which can be chosen to be 
orthogonal to all of the unperturbed occupied orbitals, {<p°}. 

Eq. ([TJ can be equivalently expressed in terms of a quantum 
Liouville equation: 



dp(t) 

l : 

dt 



H KS (t),p(t) 



(7) 



where pit") is the reduced one-electron KS density matrix 
whose kernel reads: 



N v 



p( r , r '; t) = <Pvi r , t)<pl(r', t), 



(8) 



v=l 



and the square brackets indicate a commutator. Linearization 
of Eq. (|7]i with respect to the external perturbation leads to: 



. dm 

dt 



H° KS ,p'(t) 



vLt(t),p°~ 



0(v' 2 ), (9) 



where p° is the unperturbed density matrix, p'(t) = pit) — 
p° , V' ext is the perturbing external potential, and Vjj XC is the 
variation of the Hartree plus XC potential linearly induced by 

n'{r,t) = p'{r,r;t): 



( irbr* -^ + S HM) n ' (r ' *' )dT ' dt '- m 



In the ADFT, the functional derivative of the XC potential is 
assumed to be local in time, — txc( r , r ')d(t — 

where kxc( v , v ') is the functional derivative of the ground- 
state XC potential, calculated at the ground-state charge den- 



>(r)=n°(r) 



In this approx- 



sity,n°(r): K XC (r,r') = fa(r , ) 

imation the perturbation to the XC potential, Eq. (jTOj, reads 
therefore: 



"Wc(M) = J K{T,v')n'{v',t)dv', (1 



1) 



where ft(r,r') = | r j L r ,| + kxc( v , v ')- By inserting Eq. ( fTTj ) 
into Eq. Q, the linearized Liouville equation is cast into the 
form: 



(12) 



where the action of the Liouvillian super-operator, C, onto p', 
C ■ p 1 , is defined as: 



C-p'= H° KS ,p + V' HX c[P%P° 



(13) 



and Vjj XC [p'] is the linear operator functional of p' whose 
(diagonal) kernel is given by Eq. ( fTT) . By Fourier analysing 
Eq. §YZ\ we obtain: 



(14) 



where the tilde indicates the Fourier transform and the hat, 
which denotes quantum operators, has been suppressed in p' 
and V' ext in order to keep the notation simple. In the absence 
of any external perturbations (V ex t(oj) = 0), Eq. (14i be- 



comes an eigenvalue equation for p' , whose eigen-pairs de- 
scribe free oscillations of the system, i.e. excited states 0. 
Eigenvalues correspond to excitation energies, whereas eigen- 
vectors can be used to calculate transition oscillator strengths, 
and/or the response of system properties to any generic exter- 
nal perturbation. 

One is hardly interested in the response of the more general 
property of a system to the more general perturbation. When 
simulating the results of a specific spectroscopy experiment, 
one is instead usually interested in the response of a specific 
observable to a specific perturbation. The expectation value 
of any one-electron operator can be expressed as the trace of 
its product with the one-electron density matrix. The Fourier 
transform of the dipole linearly induced by the perturbing po- 
tential, V' ext , for example, reads therefore: 



d(w) =Tr(fp' (to)) 



(15) 



where f is the quantum-mechanical position operator, and p' is 



the solution to Eq. ( 14 1. Let us now suppose that the external 
perturbation is a homogeneous electric field: 



Kxti r ,u) = -E(w)t. 
The dipole given by Eq. ( fT5] > reads therefore: 



di{ui) 
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(16) 



(17) 



where the dynamical polarizability, a,-j(u;), is defined by: 

aij(u) = -Tr (h(io - C)' 1 ■ [f 3 ,p°}) . (18) 

Traces of products of operators can be seen as scalar products 
defined on the linear space of quantum mechanical operators. 
Let A and B be two general one-electron operators. We define 
their scalar product as: 



(A\B) = Tr 



(19) 
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Eq. (JTSj can thus be formally written as: 

<*ij(u) = - (fi\(u - C)- 1 ■ sj) 

where 

§ j = [fj,P°] 



(20) 



(21) 



is the commutator between the position operator and the un- 
perturbed one-electron density matrix. The results obtained 
so far and embodied in Eq. ( pO) can be summarized by say- 
ing that within TDDFT the dynamical polarizabilty can be ex- 
pressed as an appropriate off-diagonal matrix element of the 
resolvent of the Liouvillian super-operator. A similar conclu- 
sion was reached in Ref. ifTTl in the context of a slightly differ- 
ent formalism. This statement can be extended in a straight- 
forward way to the dynamic linear response of any observable 
to any local one-electron perturbation. It is worth noticing 
that the operators that enter the definition of the scalar prod- 
uct in Eq. ( |2"0] > are orthogonal because f i is Hermitean and 
§j anti-Hermitean (being the commutator of two Hermitean 
operators), and the trace of the product of one Hermitean and 
one anti-Hermitean operators vanishes. 



A. Representation of density matrices and other one-electron 
operators 



The calculation of the polarizability using Eqs. ( 18 1 or 



implies that we should be able to compute (£ — oj) -i • [fj,p° 
in a super-operator linear system. The latter task, in turn, 
requires an explicit representation for the density-matrix re- 
sponse, p', for its commutator with the unperturbed Hamilto- 
nian, for local operators, such as fj of V^ xc , for their com- 
mutators with the unperturbed density matrix, as well as for 
the Liouvillian super-operator, or at least for its product with 
any relevant operators, A, such as C ■ A. 

A link between the orbital and density-matrix representa- 
tions of TDDFT expressed by Eqs. ([5]) and (|9]i can be obtained 
by linearizing the expression ^ for the time-dependent den- 
sity matrix: 

p'(T, r ';t) =^[^(r)^(r',t) + ^(r,t)^*(rO], (22) 

V 

whose Fourier transform reads: 

p'(r,r';u)) = 

J2 [^(r)^(r', -u) + <p' v (r, u) V a v * (r')] . (23) 

V 

Eq. ( f23| shows that p(uo) is univocally determined by the 
two sets of orbital response functions, x' = {<p' v (r, oj)} and 
y' = {(p''*(r, — L))}. A set of a number of orbitals equal to 
the number of occupied states, such as x' or y', will be nick- 
named a batch of orbitals. Notice that p(oj) is not Hermitian 
because the Fourier transform of a Hermitian, time-dependent, 
operator is not Hermitian, unless the original operator is even 
with respect to time inversion. Because of the orthogonality 



between occupied and response orbitals ((<Pv\<p' v i) = 0), Eq. 
p2| implies that the matrix elements of p' between two un- 
perturbed KS orbitals which are both occupied or both empty 
vanish (p' vv , — p' cc / — 0), as required by the idempotency of 
density matrices (p 2 = pi) in DFT As a consequence, in order 
to calculate the response of the expectation values of a Hermi- 
tian operator, A, such as in Eq. (15) , one only needs to know 
and represent the occupied-empty (vc) and empty-occupied 
(cv) matrix elements of A, A vc and A cv . In other terms, if 
we define as P = J2 V \<pZ){<pZ\ = /5° and Q = 1 - P as 
the projectors onto the occupied- and empty-state manifolds, 
respectively, one has that: 



Tr(Ap'(u))=Tr(A'p'(uj)), 



(24) 



where A' — PAQ + QAP is the vc-cv component of A, 
which can be easily and conveniently represented in terms of 
batches of orbitals. To this end, let us define the orbitals: 

<(r) = Qi^(r)=^^(r)4 CT , (25) 

C 

o»(r) = (QiV°(r))*=5>°*(r)A«, (26) 

C 

One has then: 



A, 



(<P>v), 



(27) 
(28) 



If Eqs. (27 1 and (28 i are used to represent density matrices, 



then the free oscillations corresponding to setting V' ext = in 
Eq. ( p"4| ) would be described by Casida's eigenvalue equations 

m. 

For simplicity and without much loss of generality, from 
now on we will assume that the unperturbed system is time- 
reversal invariant, so that the unperturbed KS orbitals, ip° and 
ip°, can be assumed to be real. The two batches of orbitals 
& x = {a^(r)} and & y = {a^(r)} will be called the batch rep- 
resentation of the A operator, and indicated with the notation 
(& x ,& v ) or ({a x },{af,}). Scalar products between operators 
(traces of operator products) can be easily expressed in terms 
of their batch representations. Let ({b*}, {6^}) be the batch 
representation of the operator B. If either of the two opera- 
tors, A or B, has vanishing vv and cc components, one has: 



A\B) = Trfi+S 



(A* V B CV + A* VC B VC ) 



(29) 



If A is Hermitian, its batch representation satisfies the rela- 
tion: a y (r) = a x (r)* , whereas anti-Hermiticity would imply: 
a y (r) = —a x (r)*. Due to time-reversal invariance and the 
consequent reality of the unperturbed KS orbitals, the batch 
representation of a real (imaginary) operator is real (imag- 
inary), and the batch representation of a local operator, V 
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(which is Hermitean, when real, or non Hermitean, when 
complex), satisfies: v%(r) = v%(r). 



In order to solve the super-operator linear system, Eq. ( 14 1 



using the batch representation, one needs to work out the batch 
representation of Vjj XC (r,uj) as a functional of p', as well as 
of the various commutators appearing therein. The charge- 
density response to an external perturbation reads: 

n'(r) = ]>>>)(^(r, W ) + ^;(r',-u,)) 

V 

= ^°(r)(4(r) + ^(r)), (30) 

V 

where ({xj,}, {y' v }) is the batch representation of the density- 
matrix response, p' . The Hartree-plus-XC potential response 
is: 



v' HX c[p'](r) = J K{v,v')n'{v')dv' 

x(4(r') + ^(r'))dr'.(31) 

Using Eqs. ( |25) and ( f26] i the batch representation of the 
Hartree-plus-XC potential response reads therefore: 



HXC 



OX / ^°(r) K (r,r')^(r') 

v' 

y.{x' v ,{v') + y' vl {v'))dv' (32) 

v' 

x(x' v ,(r')+y' v ,{r , ))dr' (33) 



,,'v 

J HXC,v 



(r) 



v ' X HXC,v{ v )i 



where: 



^w(r,r')- K (r,r')^(r)^,(r') 



(34) 



(35) 



[See Eq. (fTTj)]. Let {«'*}) be the batch representa- 

tion of a local operator, V'. The batch representation of the 
commutator between V' and the unperturbed density matrix, 
V" = [V',p°], reads: 



«"*(r) = Q[\>',p°]^(r) 
t/'»(r) = -v"*(r). 



(36) 
(37) 



The batch representation of the commutator between the un- 
perturbed Hamiltonian and the density-matrix response, p" — 
[H°,p'], reads: 



<(r) = Q[H°,j?]<pl(T) 



(H° - s v )x' v (r) 
-(H° - E v )y' v (r)- 



(38) 
(39) 



The batch representation of the product of the Liouvillian with 
the density-matrix response (£ • p') appearing in Eq. ( fT4| ) 
reads: 



C 



V + IC K 
-K -V-K, 



(40) 



where the action of the T> and JC super-operators on batches 
of orbitals is defined as: 



V{x v (r)} = {(H° - e v )x v (r)} 



(41) 



£{xv(r)} 



y Xw(r,r')^(r')*'|. (42) 



Note that, according to Eqs. ( |40] >, ( |4T] ), and ( |42] i, the calcu- 
lation of the product of the Liouvillian with a general one- 
electron operator in the batch representation only requires 
operating on a number of one-electron orbitals equal to the 
number of occupied KS states (number of electrons), with- 
out the need to calculate any empty states. In particular, the 
calculation of Eq. ( |42| is best performed by first calculating 
the HXC potential generated by the fictitious charge density 

"-( r ) = J2 V x v{ T ) ( Pv( r )> an( ^ trien a PPly m 8 ^ to eacn unper- 
turbed occupied KS orbital, ip° (r) . The projection of the re- 
sulting orbitals onto the empty-state manifold implied by the 
multiplication with Q is easily performed using the identity: 

Q = 1 — J2 V IvSXvSI' as i 1 i s common practice in DFPT 

Following Tsiper [28 1, it is convenient to perform a 45° 
rotation in the space of batches and define: 



Qv(r) = ^(x v (r) + y v (r)) 
Pv(r) = -(x v (v)-y v (r)). 



(43) 
(44) 



Eqs. ( |43| l and ( |44| l define the standard batch representation 
(SBR) of the density-matrix response. The SBR of the re- 
sponse charge density is: 



i'(r) = 2 5»)3„(r). 



(45) 



The SBR of a general one-electron operator is defined in a 
similar way. In particular, the SBR of a real Hermitian opera- 
tor has zero p component, wheres the SBR of the commutator 
of such an operator with the unperturbed density matrix has 
zero q component. The standard batch representation of the 
TDDFT Liouville equation, Eq. ( fl4| >, reads: 



uo -V 
-V-2K. uj 





{Qv ext (r)<p° v (r)} 



(46) 

In conclusion, the batch representation of response density 
matrices and of general one-electron operators allows one to 
avoid the explicit calculation of unoccupied KS states, as well 
as of the Liouvillian matrix, which is mandatory when (very) 
large one-electron basis sets (such as PW's or real-space grids) 
are used to solve the ground-state problem. This representa- 
tion is the natural extension to the time-dependent regime of 
the practice that has become common since the introduction 
of time-independent DFPT |[T4l [TBI l30l . 
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B. Ultra-soft pseudopotentials 

The formalism outlined above applies to all-electron as 
well as to pseudopotential calculations performed using norm- 
conserving pseudopotentials, which give rise to an ordinary 
KS ground-state eigenvalue problem. Ultra-soft pseudo- 
potentials (USPP's) ||27l , instead, give rise to a generalized 
KS ground-state eigenvalue problem and the time evolution 
within TDDFT has to be modified accordingly E4l l25l . The 
generalization of the TDDFT formalism to USPP's has been 
presented in full detail in Ref. 11251 . and here we limit our- 
selves to report the main formulas. 

In the framework of USPP's, the charge density is written 
as a sum n(r, t) = nP s (r,t) + n aus (r,t). The delocalized 
contribution, n us , is represented as the sum over the squared 



Eq. (|5j, also contains the overlap operator in the USPP for- 
malism: 

lS -Q t = [ H KS ~ Se v) ^i,M) + 

+ (50) 

Using the same derivation as before, but starting from 
Eq. ([50]) instead of Eq. Q, we arrive at a standard batch rep- 
resentation of the TDDFT Liouville equation in the USPP for- 



malism. It has the same form as Eq. (46 1 above, but with the 
super-operators T> and JC replaced by: 



V Ui >{x v (r)} = {(S- 1 H°-e v )x v (r)} 



(51) 



moduli of the KS orbitals: 



,us 



M) 



augmentation charge, n aug , instead, is written in terms of so 
called augmentation functions Q^ TO (r) 



um over the squared ( f ] 

KM)| 2 - The fC US {x v (v)} = S-'QY, J K vv ,{r,r')x v ,{r')dr' , 
ritten in terms of so- I •»' ) 
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Q£ m (r)<M*M)</&M*)}- (47) and the right-hand side of Eq. g§ by 



The augmentation functions, as well as the functions /3^(r) = 
(3 n (r — R/ ) are localized in the core region of atom /. The /3's 
each consist of an angular momentum eigenfunction times a 
radial function that vanishes outside the core radius. Typically 
one or two such functions are used for each angular momen- 







(52) 



(53) 



In this case the projector onto the empty-state manifold is de- 
fined as 



turn channel and atom type. The indices n and m in Eq. (47 1 
run over the total number of such functions for atom /. In 
practice, the functions Q nm (r) and (3 n (r) are provided with 
the pseudopotential for each type of atom. 

The advantage of using USPP's over standard norm- 
conserving pseudopotentials comes from this separation of the 
strongly localized contributions to the charge density from the 
more delocalized contributions. The square moduli of the KS 
orbitals only represent the latter part of n(r, t), and therefore 
fewer Fourier components in the representation of the orbitals 
are sufficient for a correct representation of the charge density. 
As a consequence the kinetic energy cutoff which determines 
the size of the basis set can be chosen much smaller in typi- 
cal USPP applications than in corresponding calculations with 
norm-conserving PP's. As shown in Ref.l25l the smaller ba- 
sis set not only reduces the dimensions of the matrices during 
the computation, but it allows also for a faster convergence of 
spectroscopic quantities, when calculated both with real-time 



Q = S-J2s\<Pl)(<Pl\ 



(54) 



The inverse overlap operator, S 1 , appearing in these expres- 
sions can be cast in the form 



s- 1 = 1 



n.m.I ,J 



(55) 



which is very similar to the S operator itself, given in Eq. (49 1 



except fact that S 1 generally connects /3-functions localized 
on different atoms. The numbers can be obtained from 
the condition SS^ 1 = 1. If the atoms are kept at fixed posi- 
tions, as it is the case here, the overlap operator is independent 
of time and the A's need to be calculated only once for all. 



or with spectral Lanczos techniques (see Sec. IIIi. 

The generalized expression for the USPP charge density 
given above entails a more complicated structure of the KS 
eigenvalue problem. Instead of the standard eigenvalue equa- 
tion d4l, one now has 



GENERALIZED SUSCEPTIBILITIES FROM 
LANCZOS RECURSION CHAINS 



H° KS <p° v (r) = e v S<p° v (r), 
where the overlap operator S is defined as 



(48) 



(49) 



n,m,7 



with q I nm = J drQ^ m (r) and 1 the identity operator. Con- 
sequently, the equation for the time-dependent KS orbitals, 



III. 



According to Eq. ( |20] i, the polarizability can be expressed 
as an appropriate off -diagonal matrix element of the resolvent 
of the non-Hermitian Liouvillian (super-) operator between 
two orthogonal vectors. The standard way to calculate such 
a matrix element is to solve first a linear system whose right- 
hand side is the ket of the matrix element, and to calculate 
then the scalar product between the solution of this linear sys- 
tem with the bra (9] [TT) . The main limitation of such an ap- 
proach is that solving linear systems entails the manipulation 
and storage of a large amount of data and that a different lin- 
ear system has to be solved from scratch for each different 
value of the frequency. In the case of a diagonal element of 



7 



a Hermitian operator, a very efficient method, based on the 
Lanczos factorization algorithm 0T1 p. 185 wdff.] is known, 
which allows to avoid the solution of the linear system alto- 
gether l32l [33l l34l [35l . Using such a method (known as the 
Lanczos recursion method) a diagonal matrix element of the 
resolvent of a Hermitean operator can be efficiently and el- 
egantly expressed in terms of a continued fraction generated 
by a Lanczos recursion chain starting from the vector with 
respect to which one wants to calculate the matrix element 
11321 [33l l34l [351 . The generalization of the Lanczos recur- 
sion method to non-Hermitian operators is straightforward, 
based on the Lanczos biorthogonalization algorithm l36l p. 
503]. This generalization naturally applies to the calculation 
of an off-diagonal matrix element between vectors that are 
not orthogonal. Less evident is how to encompass the cal- 
culation of off-diagonal matrix elements between orthogonal 
vectors. In Ref. [26 1 such matrix elements were treated us- 
ing a block version of the Lanczos bi-orthogonalization. This 
approach has the drawback that a different Lanczos chain has 
to be calculated for the response of each different property to 
a given perturbation (i.e. for each different bra in the matrix 
element corresponding to a same ket). In the following, we 
generalize the recursion method of Hay dock, Heine, and Kelly 
Il32l[33ll34l[35l . so as to encompass the case of an off-diagonal 
element of the resolvent of a non-Hermitean operator without 
resorting to a block variant of the algorithm and allowing to 
deal with the case in which the left and the right vectors are 
orthogonal. This will allow us to calculate the full dynamical 
response of any dynamical property to a given perturbation, 
from a single scalar Lanczos chain. 

We want to calculate quantities such as: 



9 (oj) = (u\(u-A)- 1 v) 



(56) 



where A is a non-Hermitean matrix defined in some linear 
space, whose dimension will be here denoted n, and u and 
v are elements of this linear space, which we suppose to be 
normalized: || u || = || v \\— 1, where || v || 2 = (v\v). For 
simplicity, and without loss of generality in view of appli- 
cations to time-reversal invariant quantum-mechanical prob- 
lems, we will assume that the linear space is defined over real 
numbers. Let us define a sequence of left and right vectors, 
{pi,p 2 , ■ ■ ■ Pk, ■ ■ ■ } and {qi, q 2 , ■ ■ ■ qk, ■ ■ • }, from the follow- 
ing procedure, known as the Lanczos bi-orthogonalization al- 
gorithm E>1 p. 503]: 



7i 5o = P1P0 

qi =pi 

Pj+iqj+i 



where: 



0, (57) 

«, (58) 

A q, <v/., - ,-/, 1. (59) 

A T pj - cxjpj - (3jPj-i, (60) 



(61) 



and (3j+i and 7j+i are scaling factors for the vectors qj+\ and 
Pj+i, respectively, so that they will satisfy: 



Thus, from an algorithmic point of view, the right-hand sides 
of Eqs. ( 59"]|60 i are evaluated first with a.j obtained from Eq. 
( |6Tj ). Then, the two scalars Pj+i and jj+i are determined so 
that Eq. ( |62"f i is satisfied. Eq. ( |62| > only gives a condition on 
the product of f3j+i and 7^+1. If we call q and p the vectors 
on the right-hand sides of Eqs. |59]l, ( |60"| ) respectively, this 
condition is that j3j +1^+1 = (q\p). In practice one typically 
sets: 



7i 



-1 = V\W)\ 

-1 = sign((q|p)) x /3 j+1 . 



(63) 
(64) 



The set of q and p vectors thus generated are said to be links 
of a Lanczos chain. In exact arithmetics, it is known that 
these two sequences of vectors are mutually orthogonal to 
each other, i.e., (qi\pj) = Ay, where is the Kronecker 
symbol. 

The resulting algorithm is described in detail, e.g., in Refs. 
ll3Tll36l . Let us define Q- 7 and P J as the (n x j) matrices: 



Q 3 = [<?i,92, • • • ,q 3 ], 

P ] = \px,P2, ■ ■ ■ ,Pj], 



(65) 
(66) 



and let e™ indicate the fc-th unit vector in a m-dimensional 
space (when there is no ambiguity on the dimensionality of 
the space, the superscript j will be dropped). The following 
Lanczos factorization holds in terms of the quantities calcu- 
lated from the recursions equations ( 58]|60 i: 



A Q J 
A T P 3 



Q J I'' + Pj+iqj+ie- 
P^' T + 7,-+iP,-+ie; 



j ' 
3 ' 



(67) 
(68) 
(69) 



where P indicates the (j x j) unit matrix, and T- 7 is the (j x j) 
tridiagonal matrix: 



(70) 
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In the present case, because of the special block structure of 
the Liouvillian super-operator and of the right-hand side ap- 
pearing in Eq. ( ftp*) , at each step of the Lanczos recursion one 
has that Cqj is always orthogonal to pj, so that, according to 
Eq. §6ty , ctj = 0. Let us now rewrite Eq. (67 1 as: 



(w - A)Q ] = Q^uj - T j ) - /3 j+1 q j+1 e 



(71) 



By multiplying Eq. (71 1 by u T (uj — A) 1 on the left and by 
[u> — Ti)~ 1 e\ on the right, we obtain: 



(lj+i\P. 



(62) 



f3 j+1 u T (uj - A)~ 1 q j - 



3 P 3. 



-AY X Q- 
_ ie f (w-TJ) 



-1^ 



(72) 
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Taking the relation Qje\ 
be cast as: 



31 



v into account, Eq. ( 72 1 can 



<?H = (C 



/Hi 



(73) 



(74) 



where: 

is an array of dimension j, and: 

£ » = /3 j+1 (u\(w - A)- l q j+1 ) (e>\(w - T^e{) . 

(75) 

is the error made when truncating the Lanczos chain at the 
j-th step. Neglecting Sj(w) we arrive at the following approx- 
imation to g(u) defined in Eq. d56b 



(76) 



This approximation is the scalar product of two arrays of di- 
mension j: §j(uj) — where rf is obtained by solving 
a tridiagonal linear system: 



(u - T j )rf> 



(77) 



T 3 is the tridiagonal matrix of Eq. d70l, and C, 3 is given by Eq. 

Three important practical observations should be made at 
this point. The first is that solving tridiagonal systems is ex- 
tremely inexpensive (its operation count scales linearly with 
the system size). The second is that the calculation of the 
sequence of vectors Q from Eq. ( f74| ) does not require the 
storage of the Qj matrix. In fact, each component C, 3 is the 
scalar product between one fixed vector (u) and the Lanczos 
recursion vector q 3 , and it can be therefore calculated on the 
fly along the Lanczos recursion chain. We note that a slightly 
better approach to evaluating Eq. (j76j would be via the LU 



factorization of the matrix u> — T 3 . If w 



T- = 



then g{uf) — (C yr jJC' J |^Jj e i)' which can be implemented 
as the scalar product of two sequences of vectors. We finally 
observe that the components of C, 3 decrease rather rapidly as 
functions of the iteration count, so that only a relatively small 
number of components have to be explicitly calculated. This 
will turn out to be essential for extrapolating the Lanczos re- 
cursion, as proposed and discussed in Sec. IV The compo- 



nents of rf — (u — T 3 ) e 3 also tend to decrease, although 
not as rapidly. In fact this is used to measure convergence of 
the Lanczos, or Arnoldi algorithms for solving linear systems, 
see, e.g., [37]. 

From the algorithmic point of view, much attention is usu- 
ally paid in the literature to finding suitable preconditioning 
strategies that would allow one to reduce the number of steps 
that are needed to achieve a given accuracy within a given iter- 
ative method |9 1. Although preconditioning can certainly help 
reduce the number of iterations, it will in general destroy the 
nice structure of the Lanczos factorization, Eq. ( |67] >, which is 
essential to avoid repeating the time-consuming factorization 
of the Liouvillian for different frequencies. In the next sec- 
tion we will show how a suitable extrapolation of the Lanczos 



coefficient allows for a substantial reduction of the number 
of iterations without affecting (but rather exploiting) the nice 
structure of the Lanczos factorization, Eqs. ( |68j ) and ( |67j ). 

We conclude that the non-symmetric Lanczos algorithm al- 
lows one to easily calculate a systematic approximation to 
the off-diagonal matrix elements of the resolvent of a non- 
Hermitean matrix. It is easily seen that, in the case of a di- 
agonal matrix element, this same algorithm would lead to a 
continued-fraction representation of the matrix element. Al- 
though the representation of Eq. ( |73"j ), which is needed in 
the case of a non-diagonal element, is less elegant than the 
continued-fraction one, its actual implementation is in prac- 
tice no more time consuming from the numerical point of 
view. 

The idea of using the Lanczos algorithm to compute func- 
tions such the one in Eq. ( |56l > is not new. In control theory, 
this function is called a transfer function and it is used to an- 
alyze the frequency response of a system much like it is done 
here. Using the Lanczos algorithm for computing transfer 
functions has been considered in, e.g., 1 38 . 39 1. The Lanczos 
and Arnoldi methods are also important tools in the closely 
related area of model reduction in control theory, see, e.g., 



IV. BENCHMARKING THE NEW ALGORITHM AND 
ENHANCING ITS NUMERICAL PERFORMANCE 

In this section we proceed to a numerical benchmark of 
the new methodology against the test case of the benzene 
molecule, a system for which several TDDFT studies already 
exist and whose optical spectrum is known to be accurately 
described by ADFT |23j|34l[26]|4T|. A careful inspection of 
the convergence of the calculated spectrum with respect to the 
length of the Lanczos chain allows us to formulate a simple 
extrapolation scheme that dramatically enhances the numeri- 
cal performance of our method. All the calculations reported 
in the present paper have been performed using the Quantum 
ESPRESSO distribution of codes for PW DFT calculations 
ll42l . Ground-state calculations have been performed with 
the PWscf code contained therein, whereas TDDFT linear- 
response calculations have been performed with a newly de- 
veloped code, soon to be included in the distribution. 



A. Numerical benchmark 

The benchmark has been performed using the Perdew- 
Burke-Ernzerhof (PBE) E3 XC functional and USPP's |25J 
l27l [44 1 with a PW basis set up to a kinetic energy cut-off of 
30 Ry (180 Ry for the charge density). This corresponds to 
a wavefunction basis set of about 25000 PW's, resulting in a 
Liouvillian superoperator whose dimension is of the order of 
750,000. Periodic boundary conditions have been used, with 
the molecule placed horizontally flat in a tetragonal supercell 
of 30 x 30 x 20 aj]. The absorption spectrum is calculated as 
I(oj) oc wlm (a(uj)), where a is the spherical average (aver- 
age of the diagonal elements) of the molecular dipole polar- 
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Figure 1: Absorption spectrum calculated using Lanczos method 
with ultrasoft pseudo-potentials. The figure shows the curve at dif- 
ferent numbers of recursive steps; a vertical shift has been introduced 
for clarity. 



izability. A small imaginary part has been added to the fre- 
quency argument, uj — > ui + it, so as to regularize the spec- 
trum. This shift into the complex frequency plane has the ef- 
fect of introducing a spurious width into the discrete spectral 
lines. In the continuous part of the spectrum, truncation of the 
Lanczos chain to any finite order results in the discretization 
of the spectrum, which appears then as the superposition of 
discrete peaks. The finite width of the spectral lines has in 
this case the effect of broadening spectral features finer than 
the imaginary part of the frequency, thus re-establishing the 
continuous character of the spectrum. The optimal value of 
the imaginary part of the frequency is slightly larger than the 
minimum separation between pseudo-discrete peaks and de- 
pends in principle on the details of the system being studied, 
as well as on the length of the Lanczos chain and on the spec- 
tral region. Throughout our benchmark we have rather arbi- 
trarily set e = 0.02 Ry. Later in this section, we will see that 
the length of the Lanczos chain can be effectively and inex- 
pensively increased up to any arbitrarily large size. By doing 
so, the distance between neighboring (pseudo-) discrete states 
in the continuum correspondingly decreases, thus making the 
choice of e noncritical. 

In Fig. [T]we report our results for the absorption spectrum 
of the benzene molecule. The agreement is quite good with 
both experimental data fl31 and previous theoretical work 
EH EH EH ED. Above the ionization threshold the TDDFT 
spectrum displays a fine structure (wiggles), which is not ob- 
served in experiments and that was suggested in Ref. PTI 
to be due to size effects associated to the the use of a finite 




CO [eV] 



Figure 2: Comparison with experimental results of the converged 
spectrum of benzene for two different sizes of the cell; for the larger 
cell the structure in the continuum decreases and reproduces the ex- 
perimental curve better. Theoretical results have been scaled so as to 
obtain the same integrated intensity as experimental data. 



simulation cell. Finite-size effects on the fine structure of the 
continuous portion of the spectrum are illustrated in Fig. [2] 
where we display the spectrum of benzene as calculated using 
two simulation cells of different size. 

Our purpose here is not to analyze the features of the ben- 
zene absorption spectrum, which are already rather well un- 
derstood (see, e.g., Ref. l26b . nor to dwell on the comparison 
between theory and experiment, but rather to understand what 
determines the convergence properties of the new method and 
how they can be possibly improved. The number of iterations 
necessary to achieve perfect convergence lies in this case in- 
between 2000 and 3000: the improvement with respect to Ref. 
Il26l is due to the smaller basis set, made possible by the use 
of ultrasoft pseudo-potentials, as discussed in Ref. Il25ll . It is 
worth noting that the convergence is faster in the low-energy 
portion of the spectrum. This does not come as a surprise be- 
cause the lowest eigenvalues of the tridiagonal matrix gener- 
ated by the Lanczos recursion converge to the corresponding 
lowest eigenvalues of the Liouvillian, and the lower the state 
the faster the convergence. 

A comparison between the performance of the new method 
with a more conventional approach based on the diagonaliza- 
tion of the Liouvillian is not quite possible because the two 
methodologies basically address different aspects of a same 
problem. While the former addresses the global spectrum of 
a specific response function, the latter focuses on individual 
excited states, from which many different response functions 
can be obtained, at the price of calculating all of the individual 
excited states in a given energy range. It suffices to say that 
it would be impractical to obtain a spectrum over such a wide 
energy range as in Fig. |T|by calculating all the eigenvalues of 
a Liouvillian. Using a localized basis set, which is the com- 
mon choice in most implementations of Casida's equations, 
it would be extremely difficult to resolve the high lying por- 
tion of the one-electron spectrum with the needed accuracy; 
using PW or real-space grid basis sets, instead, the calculation 
of very many individual eigen-pairs of the Liouvillian matrix 
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Figure 3: (a) Numerical behavior of the components of the £ J vector 
given by equation [74] Apart for some out of scale oscillation they 
tend rapidly to a value near zero, (b) Numerical behavior of f3j co- 
efficients given by Eq. l |63[ >. They tend rapidly to a constant value 
even if some larger scale oscillation is present. In the inset the same 
data are shown on a different scale and with different colors for odd 
(green) and even (red) coefficients. 



whose dimension easily exceeds several hundreds thousands 
would be a formidable task. 



The comparison with time-propagation schemes is instead 
straightforward and more meaningful. Typical time steps and 
total simulation lengths in a time propagation approach are of 
the order of 10~ 18 s, and 10 -14 s, respectively, which amounts 



to about 10,000 time propagation steps [25|. The computa- 
tional workload at each time step depends on the propagation 
algorithm. One commonly used technique relies on a fourth- 
order Taylor expansion of the propagator, together with so- 
called enforced time-reversal symmetry 1461 . In this case, each 
time step requires eight applications of the Hamiltonian to the 
KS orbitals and one evaluation of the Hartree plus exchange- 
correlation potentials. In the Lanczos approach, each step re- 
quires two applications of the Hamiltonian and one evaluation 
of the potentials. Furthermore, the response orbitals must be 
kept orthogonal to the ground-state orbitals. This results in a 
computational effort which is sensibly lower for one recursion 
step than for one time propagation step. Considering both the 
larger number of propagation steps and the more expensive 
workload at each step, we can conclude that our approach is 
definitely more efficient than the time-propagation method to 
compute linear response spectra. 



B. Analysis 

In Fig. [3] we report the values of the (3 coefficients and 
of the last component of the ( vectors (see Eqs. 63 and 74 1, 



as functions of the Lanczos iteration count, calculated when 
the direction of both the perturbing electric field and the ob- 
served molecular dipole are parallel to each other and lying 
in the molecular plane (this would correspond to calculating, 
say, the xx component of the polarizability tensor). It is seen 
that the £ components rapidly tend toward zero, whereas the 
f3's tend to a constant. Closer inspection of the behavior of 
the latter actually shows that the values of the /3's are scat- 
tered around two close, but distinct, values for even and odd 



iteration counts. The 7 coefficients (see Eq. 64 1 are in general 
equal to the /3's, and only in correspondence with few iterative 
steps they assume a negative sign. 

All the calculated quantities, /3, 7, and £, are subject to 
occasional oscillations off their asymptotic values. The ob- 
served oscillations in the coefficients jj and j3j can be partly 
explained from their definitions, namely Eqs. ( |63]|64 



Note 

m. 



at first that there is a risk of a division by zero in Eq. 
The occurrence of a zero scalar product (q\p) is known as a 
breakdown. Several situations can take place. A lucky break- 
down occurs when one of the vectors q or p is zero. Then the 
eigenvalues of the tridiagonal matrix are exact eigenvalues of 
the matrix A, as the space spanned by Q 3 (when q = 0) be- 
comes invariant under A, or the space spanned by P J (when 
p = 0) becomes invariant under A T . Another known situa- 
tion is when neither q nor p are zero but their inner product 
is exactly zero. This situation has been studied extensively in 
the literature: see, e.g., 1471 148, 49 1. One of the main results 
is that when this breakdown takes place at step j say, then it is 
often still possible to continue the algorithm by essentially by- 
passing step j and computing qj + 2,Pj+2> or some qj+i,Pj+i 
where I > 1, directly. Intermediate vectors are needed to 
replace the missing qj+i, ...qj+1-1 and Pj+%, ...pj+i-x, but 
these vectors are no longer bi-orthogonal, resulting in the 
tridiagonal matrix being spoiled by bumps in its upper part. 
The class of algorithms devised to exploit this idea are called 
look-ahead Lanczos algorithms (LALAs), a term first em- 
ployed in J47]. Finally an incurable breakdown occurs when 
no pair q i+ i,pj + i with some I > 1 can be constructed which 
has the desired orthogonality properties. Note that this type 
of breakdown cannot occur in the Hermitian Lanczos algo- 
rithm, because it is a manifestation of the existence of vectors 
in the right subspace (linear span of Q J ) that are orthogonal to 
all the vectors of the left subspace (linear span of P- 7 ), which 
is impossible when these spaces are the same (Q 3 = P J in 
the Hermitian case). Clearly, exact breakdowns (inner prod- 
uct (q\p) exactly equal to zero) almost never occur in prac- 
tice. Near breakdowns correspond to small values of these 
inner products that determine the observed jumps in the co- 
efficients j3j , 7j .The components of the Q 's can also show 
jumps in their magnitude since the vectors q 3 will occasion- 
ally display large variations in norm. In finite-precision arith- 
metics the occurrence and precise location of (near-) break- 
downs would also depend on the numerical details of the im- 
plementation. Nevertheless in our experience the Lanczos re- 
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Figure 4: Convergence of the absorption spectrum of benzene using 
the extrapolation procedure described in the text. After N iterations 
the components of are set to zero and the f3's are extrapolated. 
The curves have been shifted vertically for clarity. 

cursion always converges to the same final spectrum whose 
calculation is therefore robust. 

In order to understand what determines this robustness, 
we note that our algorithm amounts to implicitly solving a 
linear system by an iterative procedure based on a Lanczos 
scheme. This procedure is mathematically equivalent to the 
Bi-Conjugate Gradient algorithm (BiCG) ll37l . The observed 
robustness is therefore consistent with what is known of BiCG 
11371 . In BiCG, the vector iterates lose their theoretical (bi-) 
orthogonality and the scalars used to generate the recurrence 
may correspondingly display very large oscillations, yet the 
solution of the linear system, which is a linear combination 
of the vector iterates, usually converges quite well. Because 
of this inherent robustness of the algorithm, we preferred not 
to use any of the several available LALAs. The shortcom- 
ings that these algorithms are designed to cure not being crit- 
ical, the marginal advantages that they may possibly provide 
are outweighed by the drawback of losing the nice tridiagonal 
structure of the Tj matrices generated by them. 

Another difficulty with generic Lanczos algorithms is the 
loss of bi-orthogonality of the Lanczos vectors. As was men- 
tioned earlier, in exact arithmetic, the left and right Lanc- 
zos vectors are orthogonal to each other. In the presence 
of round-off, a severe loss of orthogonality eventually takes 
place. This loss of orthogonality is responsible for the ap- 
pearance of so-called ghost or spurious eigen-pairs of the ma- 
trix to be inverted. As soon as the linear span of the Lanc- 
zos iterates is large enough as to contain a representation of 
an eigenvector to within numerical accuracy, the subsequent 
steps of the Lanczos process will tend to generate replicas 
of this eigenvector. At this point the Lanczos bases (left 
or right spaces) become linear dependent to within machine 
precision. From the point of view of solving the systems 
[u> — A)x = v, the effect of these replicated eigenvalues is not 
very important. Indeed, when thinking in terms of the BiCG 
algorithm, after the underlying sequence of approximations 
x 3 = Q j (uj - T j y 1 e[ obtained from the BiCG algorithm 
converges to a; = (u> — A)^ 1 v, further iterations will only add 



very small components to xK As a result the contributions of 
these replicas are bound to be negligible and this is observed 
in practice. Thus, ghost eigenvectors have zero (or very small) 
oscillator strengths and their contribution to the wanted inner 
products (u\x 3 ), which approximate g(ui) in Eq. (|56j>, will be 
negligible in general. 



C. Extrapolating the Lanczos recursion chain 

The fast decrease of the components of implies that the 
quality of the calculated spectrum depends only on the first 
few hundreds of them. Specifically, if we set the components 
of the C J vector equal to zero in Eq. ( |73j ) after, say, 300-400 it- 
erations, but we keep the dimension of the tridiagonal matrix, 
T- 7 , of the order of 2-3000, the resulting spectrum appears to 
be still perfectly converged. Unfortunately, a relatively large 
number of iterations seems to be necessary to generate a tridi- 
agonal matrix of adequate dimension. The regular behavior 
of the /3's for large iteration counts suggests an inexpensive 
strategy to extrapolate the Lanczos recursion. Let us fix the 
dimension of the tridiagonal matrix in Eq. f73] > to some very 
large value (say, N* = 10000), and define an effective £ J vec- 
tor, , and T J matrix, T$ , by setting the k-th component 
of e q ua l to zero for k > N, and the fc-th component of 
(3 equal to the appropriate estimate of the asymptotic value 
for odd or even iteration counts, obtained from iterations up 
to N. In general, as previously noted, it very seldom occurs 
that 7j and (3j have a different sign, and we found that that ex- 
trapolating them to the same positive value does not invalidate 
significantly the accuracy of the extrapolation. 

In Fig. |4]we display the spectra, Jjv(w), obtained from the 
extrapolation procedure just outlined, which from now on will 
be referred to as the bi-constant extrapolation of the Lanczos 
coefficients. One sees that the extrapolated spectrum is at per- 
fect convergence already for a very modest value of N in be- 
tween N — 500 and N = 1000, a substantial improvement 
with respect to the results shown in Fig. [T] Note that this ex- 
trapolation procedure, although necessarily approximate, of- 
fers a practical solution to the problem of recovering a con- 
tinuous spectrum from a limited number of recursion steps. 
As the dimension of the tridiagonal matrix appearing in Eq. 
( |73) ) can be made arbitrarily large at a very small cost, the 
distance between neighboring pseudo-discrete eigenvalues in 
the continuous part of the spectrum can be made correspond- 
ingly small, thus allowing to chose the imaginary time of the 
frequency basically as small as wanted. 

A qualitative insight into the asymptotic behavior of the 
Lanczos recursion coefficients can be obtained from the anal- 
ogy with the continued-fraction expansion of the local den- 
sity of states (LDOS) for tight-binding (TB) Hamiltonians, a 
problem that has been the breeding ground for the application 
of Lanczos recursion methods to electronic-structure theory 
ll32l [33l l34l [35). Since the late seventies it has been known 
that the coefficients of the continued-fraction expansion of a 
connected LDOS asymptotically tend to a constant — which 
equals one fourth of the band width — whereas they oscillate 
between two values in the presence of a gap: in the latter case 
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the average of the two limits equals one fourth of the total 
band width, whereas their difference equals one half the en- 
ergy gap llBTfl . These results can be easily verified in the case 
of a ID TB Hamiltonian with constant hopping parameter, {3, 
which leads to the continued fraction: 
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where the sign has to be chosen so as to make the Green's 
function have the proper imaginary part. In this case, one sees 
that the imaginary part of the Green's function (which equals 
the LDOS) is non-vanishing over a band that extends between 
—2(3 and 2/3. In the case were consecutive hopping parameters 
of the recursion chain oscillate between two values, /3j and /3 2 , 
which we assume to be positive, the resulting Green's function 
reads: 
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-/3 2 |and/3 1 +/3 2 



in this case we obtain two bands between \(3\ - 
and between —(/3i + P2) and — \Pi — P 2 \- 

In our case, the relevant band width of the Liouvillian 
super-operator extends from minus to plus the maximum ex- 
citation energy. In a PP-PW pseudo-potential scheme, in turn, 
the latter is of the order of the PW kinetic-energy cutoff, E cut , 
whereas the gap is of the order of twice the optical gap, A. 
We conclude that the asymptotic values for the (3 and 7 coef- 
ficient of the Liovillian Lanczos chain are: ^ 1 +^ 2 ss Miatk anc j 
\Pi — /?2 1 ~ A. In Fig. [5^ we report the behavior of the values 
of the (3 coefficients of the Liouville Lanczos chain calculated 
for benzene, vs. the iteration count, for different plane-wave 
kinetic-energy cutoffs. In Fig. [5J) the average asymptotic 
value is plotted against the kinetic-energy cutoff, demonstrat- 
ing a linear dependence (3^ ss \E cut , in remarkable agree- 
ment with the qualitative analysis described above. Also the 
difference between the asymptotic values for odd and even it- 
eration counts (|/3^ d - /3™ e ™| ~ 0.46 Ry) is in remarkable 
qualitative agreement with the optical gap (A = 0.38 Ry). 



V. APPLICATION TO LARGE MOLECULES : 
FULLERENE AND CHLOROPHYLL A 

In order to demonstrate the applicability of our methodol- 
ogy to large molecular systems, we present now the results ob- 
tained for the prototypical cases of fullerene Cqo and chloro- 
phyll a. 
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Figure 5: (a) Behavior of /3's coefficients of benzene for different 
values of the kinetic energy cut-off. (b) The asymptotic values f3 x 
plotted as a function of the kinetic energy cut-off; the figure shows 
that they can be connected by a straight line with slope of about 0.5. 



Let us begin with fullerene, a molecule whose spectrum has 
already been the subject of extensive experimental BT1 l52l 
and theoretical (22] @I] ED |53] E! |55] studies. Our calcula- 
tions have been performed with the molecule lying in a cubic 
super-cell with side length of 35 ao, using the PBE XC func- 
tional. Ultra-soft pseudo-potentials [44| have been used, with 
a PW basis set with a kinetic energy cut-off of 30 Ry for the 
wavefunctions and 180 Ry for the charge density. This cor- 
responds to almost 60,000 PW's, with a dimension of the full 
Liouvillian exceeding 14 millions. The Lanczos recursion is 
explicitly computed up to different orders, N, as indicated in 
Sec. ~ 



III and then extrapolated up to N* = 20000, as dis- 



cussed in Sec. IV (this value has been chosen rather arbitrarily 
because both the numerical workload and the resulting accu- 
racy depends very little on it, as long as it is large enough). In 
order to regularize the solution of the tridiagonal linear sys- 
tem, Eq. ( |77| ), the spectrum has been calculated at complex 
frequencies whose imaginary part is (also rather arbitrarily) 
taken as e = 0.02 Ry. In Fig. [6^ we report the calculated 
absorption spectrum between and 40 eV. We see that, upon 
bi-constant extrapolation, the calculated spectrum is already 
very good after as few as 500 iterations, and practically in- 
distinguishable from convergence after 1500 iterations. The 
resulting spectrum depends very little on the precise choice 
of e as long as its value is smaller than the distance between 
neighboring eigenvalues of the tridiagonal matrix of Eq. ( |77j ) 
(this distance goes to zero in the continuous portion of the 
spectrum as N* grows large), and larger than the desired res- 
olution of the calculated spectrum. 

The overall shape of our calculated spectrum is in substan- 
tial agreement with that calculated in Refs. Il22l |4T1 l54l us- 
ing the real-time approach to TDDFT. In spite of the small 
atomic basis set used in Ref. Il54l . the number of integra- 
tion steps that was found to be necessary to reach an ac- 
ceptable accuracy (6000) is rather larger than ours. In Refs. 
E2l l4Tll where a real-space grid representation of the KS 
equations was adopted, instead, the number of time steps em- 
ployed is one to two orders of magnitude larger than ours 
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Figure 6: (a) Convergence of the absorption spectrum of fullerene 
calculated between and 40 eV. The curves have been shifted ver- 
tically for clarity, (b) The fully converged absorption spectrum 
of fullerene compared with experimental results 1 5 1 1 in the energy 
range between 2 and 7 eV. For comparison purposes TDDFT results 
have been rescaled in order to have the first transition peak at the 
same height as that of experimental results. Theoretical results have 
been scaled so as to obtain the same integrated intensity as experi- 
mental data. 



(30-40,000). Considering that several Hip products are nec- 
essary at each time step in real-time approaches, whereas 
only two are needed at each Lanczos recursion, we see that 
our combined use of the Liouville-Lanczos algorithm with 
bi-constant extrapolation and ultra-soft pseudopotentials with 
plane waves allows for a substantial reduction of the numeri- 
cal workload, while keeping the full accuracy allowed by the 
XC functionals currently available. 

The absorption spectrum of Ceo is characterized by a low- 
lying and well structured portion (between, say, 3 and 7 eV) 
dominated by ir — > tt* transitions, followed by a broader fea- 
ture between 14 eV and 27 eV determined by transitions from 
both a and tt molecular orbitals. In Fig. |6|b) we compare 
our converged spectrum with the experimental results of Ref. 
||5T1 . Despite a slight redshift compatible with that found in 
the calculations of Ref. BP . the overall shape of the TDDFT 
spectrum is in good agreement with experiment. Note that the 
theoretical results reported in Ref. [51], which were obtained 
by calculating individual eigen-pairs of the Casida's equation, 



could hardly be extended to such a broad energy range as cov- 
ered in the present calculation, because too many lines would 
have to be calculated. 
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Figure 7: (a) Convergence of the chlorophyll absorption spectrum 
between and 40 eV. The curves have been shifted vertically for 
clarity, (b) Chlorophyll absorption spectrum in the visible region for 
wavelengths between 400 and 700 nm compared with the experimen- 
tal data in di-ethyl ether of Ref. [56 1. Theoretical results have been 
scaled so as to obtain the same integrated intensity as experimental 
data. 

An even more challenging test is chlorophyll, a molecule 
which is of fundamental importance for life on Earth since 
it is responsible for the photosynthetic process. There are 
several different forms of this molecule, and we will fo- 
cus on chlorophyll a (C55H72MgN4o). Historically the in- 
terpretation of the visible spectrum of chlorophyll relies on 
the 4-orbital Gouterman model of porphyrins ll57l in which 
only the two highest occupied molecular orbitals and the 
two lowest unoccupied molecular orbitals are considered. In 
the last few years there have been several calculations of its 
low energy spectrum relying on different ab initio techniques 
ED EH iUl ED |62] |H. Despite the fact that TDDFT seems 
to produce spurious charge transfer states in the visible region 
iRm . according to our calculations the overall shape of the low 
energy part of spectrum seems to be correctly predicted. Our 
calculations have been performed using a super-cell of dimen- 
sions 35 x 45 x 55 a;j with the PW91 XC functional J64] and 
USPP's [44|. Molecular orbitals where expanded in PW's up 
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to a kinetic energy cut-off of 30 Ry, while 180 Ry were used 
for the charge density. The PW basis sets consists of more 
than 120000 PW's, while the dimension of the Liovillian su- 
peroperator exceeds 42 millions. In this case the imaginary 
part of the frequency was set to e = 0.002 Ry to better com- 
pare the results with experiments. In Fig. |7|a) we display 
the convergence of the spectrum with respect to the number 
of Lanczos steps, using the usual bi-constant extrapolation of 
the coefficients, as calculated over a wide range of energy be- 
tween and 40 eV. In Fig. |7Jb) we compare the visible part of 
the spectrum calculated in this work with the experimental re- 
sults obtained in diethyl solution in Ref. |56|. The agreement 
with experiment is clearly good but the Soret (B) band located 
in the indigo region of the spectrum at 430 nm is slightly red- 
shifted in the calculation, while the red band (Q) has an op- 
posite, blue-shifted behavior. How much of this discrepancy 
has to be attributed to the limitations of the AXCA alone, or 
to a combination of them with the neglect of solvation effects 
remains to be ascertained. 



VI. CONCLUSIONS 

In this paper we have presented a new algorithmic ap- 
proach to linearized TDDFT that combines the advantages 
of the more conventional real-time and Casida's eigenvalue 
methods, while avoiding many of their drawbacks. This ap- 
proach results from the combination of many elements which 
are individually not new in different communities, ranging 
from condensed matter and quantum chemistry, to control 
theory/engineering and signal processing. In particular it is 
the natural extension to the dynamical regime of density- 
functional perturbation theory, a method made popular in 
the condensed-matter community by the calculation of static 



properties (such as dielectric, piezoelectric, elastic) and by 
the calculation of phonons and related properties in crystals. 
The main features of the new method are that it is tailored to 
the calculation of specific responses to specific perturbations 
and that the computational burden for the calculation of the 
complete spectrum of a given response function in a wide fre- 
quency range is comparable to that of a single static ground- 
state or response-function calculation. We believe that, from 
the algorithmic point of view, the new method is close to op- 
timal in its application range and that it opens thus the way to 
the simulation of the dynamical properties of large and very 
large molecular and condensed-matter systems. Assuredly, it 
cannot yield any better results than granted by the quality of 
the XC functional used to implement it. Devising new XC 
functionals capable of properly describing the electron-hole 
interaction responsible, e.g., of Rydberg and exci tonic effects 
in the low-lying portion of the spectrum of molecular and ex- 
tended systems, respectively, remains a major problem to be 
addressed and solved. 
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